Inventi Impact: Bioinformatics

Articles

Inventi:ebi/18/12

LACK OF SUFFICIENTLY STRONG INFORMATIVE FEATURES LIMITS THE POTENTIAL OF GENE EXPRESSION ANALYSIS AS PREDICTIVE TOOL FOR MANY CLINICAL CLASSIFICATION PROBLEMS

31-Mar-2012 Research 2012 : April - June

Kenneth R Hess, Caimiao Wei, Yuan Qi, Takayuki Iwamoto, W Fraser Symmans, Lajos Pusztai

Background: Our goal was to examine how various aspects of a gene signature influence the success of developing multi-gene prediction models. We inserted gene signatures into three real data sets by altering the expression level of existing probe sets. We varied the number of probe sets perturbed (signature size), the fold increase of mean probe set expression in perturbed compared to unperturbed data (signature strength) and the number of samples perturbed. Prediction models were trained to identify which cases had been perturbed. Performance was estimated using Monte-Carlo cross validation.

Results: Signature strength had the greatest influence on predictor performance. It was possible to develop almost perfect predictors with as few as 10 features if the fold difference in mean expression values was > 2, even when the spiked samples represented 10% of all samples. We also assessed the gene signature set size and strength for 9 real clinical prediction problems in six different breast cancer data sets.

Conclusions: We found sufficiently large and strong predictive signatures only for distinguishing ER-positive from ER-negative cancers; there were no strong signatures for more subtle prediction problems. Current statistical methods efficiently identify highly informative features in gene expression data if such features exist, and accurate models can be built with as few as 10 highly informative features. Features can be considered highly informative if at least a 2-fold expression difference exists between comparison groups, but such features do not appear to be common for many clinically relevant prediction problems in human data sets.
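The spike-in design described in the Background can be illustrated with a minimal sketch. It is not the authors' code, and several pieces are assumptions made for illustration: the expression matrix is taken to be on a log2 scale (so a fold increase becomes an additive shift of log2(fold)), the function names `spike_signature`, `auc` and `monte_carlo_cv` are hypothetical, and the predictor is a simple stand-in (top features by pooled-variance t-statistic, scored by a signed sum) rather than any of the models evaluated in the paper.

```python
import numpy as np

def spike_signature(X, n_probes, fold, frac_samples, rng):
    """Return a perturbed copy of X (samples x probe sets, assumed log2 scale)
    and 0/1 labels marking which samples were perturbed."""
    X = X.copy()
    n_samples, n_features = X.shape
    n_spiked = max(2, int(round(frac_samples * n_samples)))
    samples = rng.choice(n_samples, size=n_spiked, replace=False)
    probes = rng.choice(n_features, size=n_probes, replace=False)
    # A fold increase on the raw scale is an additive shift on the log2 scale.
    X[np.ix_(samples, probes)] += np.log2(fold)
    y = np.zeros(n_samples, dtype=int)
    y[samples] = 1
    return X, y

def auc(scores, y):
    """Area under the ROC curve via the rank-sum identity (assumes no ties)."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n1 = int(y.sum())
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

def monte_carlo_cv(X, y, n_top=10, n_splits=50, test_frac=1 / 3, rng=None):
    """Mean test AUC over repeated random stratified train/test splits.
    Feature selection is redone inside every split, on training data only."""
    rng = rng if rng is not None else np.random.default_rng()
    results = []
    for _ in range(n_splits):
        test_parts = []
        for c in (0, 1):
            idx = np.flatnonzero(y == c)
            n_test = max(1, int(round(test_frac * len(idx))))
            test_parts.append(rng.choice(idx, size=n_test, replace=False))
        test = np.concatenate(test_parts)
        train = np.setdiff1d(np.arange(len(y)), test)
        Xtr, ytr = X[train], y[train]
        n1, n0 = int((ytr == 1).sum()), int((ytr == 0).sum())
        m1, m0 = Xtr[ytr == 1].mean(axis=0), Xtr[ytr == 0].mean(axis=0)
        v1 = Xtr[ytr == 1].var(axis=0, ddof=1)
        v0 = Xtr[ytr == 0].var(axis=0, ddof=1)
        pooled = ((n1 - 1) * v1 + (n0 - 1) * v0) / (n1 + n0 - 2)
        t = (m1 - m0) / np.sqrt(pooled * (1 / n1 + 1 / n0) + 1e-12)
        top = np.argsort(-np.abs(t))[:n_top]
        # Score each test sample by its signed distance from the class midpoint.
        centered = X[test][:, top] - (m1[top] + m0[top]) / 2
        scores = (centered * np.sign(t[top])).sum(axis=1)
        results.append(auc(scores, y[test]))
    return float(np.mean(results))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a real data set: 100 samples, 500 probe sets.
    X = rng.normal(loc=8.0, scale=1.0, size=(100, 500))
    Xs, y = spike_signature(X, n_probes=20, fold=4.0, frac_samples=0.10, rng=rng)
    print("mean test AUC:", monte_carlo_cv(Xs, y, n_top=10, rng=rng))
```

Varying `n_probes`, `fold` and `frac_samples` corresponds to the three factors the study varied: signature size, signature strength and the number of samples perturbed. The synthetic Gaussian matrix here only makes the sketch runnable; the study itself perturbed real data sets.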

How to Cite this Article
CC Compliant Citation: Hess et al.: Lack of sufficiently strong informative features limits the potential of gene expression analysis as predictive tool for many clinical classification problems. BMC Bioinformatics 2011 12:463. doi:10.1186/1471-2105-12-463.